feat: Add markdown table conversion pipeline with pulldown-cmark #180
Merged
thepagent merged 4 commits into openabdev:main on Apr 13, 2026
- Introduce pulldown-cmark as markdown parser for accurate table detection
- Add TableMode config (code/bullets/off) via [markdown] section in config.toml
- Convert detected tables before sending final content to Discord
- Design as reusable pipeline for future multi-channel support

Closes #178
- Use unicode-width crate for column width calculation (fixes CJK/emoji alignment)
- Use saturating_sub for padding to prevent underflow
- Handle inline markup inside table cells (bold, italic, strikethrough, link)
- Convert SoftBreak/HardBreak to space inside cells
- Fix trailing blank line after last row in bullets mode
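The padding fix in this commit can be illustrated with a small std-only sketch. It is not the PR's code: `display_width` is a hypothetical, rough stand-in for the `unicode-width` crate (it only treats a few East Asian ranges as double-width), and `pad_cell` is an assumed helper name.

```rust
/// Hypothetical stand-in for the `unicode-width` crate: characters in a few
/// common East Asian wide ranges count as two columns, everything else as one.
/// This is an approximation for illustration only.
fn display_width(s: &str) -> usize {
    s.chars()
        .map(|c| match c as u32 {
            0x1100..=0x115F
            | 0x2E80..=0xA4CF
            | 0xAC00..=0xD7A3
            | 0xF900..=0xFAFF
            | 0xFF00..=0xFF60 => 2,
            _ => 1,
        })
        .sum()
}

/// Pads `cell` with spaces up to `target` display columns.
/// `saturating_sub` yields 0 when the cell is already wider than the target,
/// where a plain `target - width` would underflow.
fn pad_cell(cell: &str, target: usize) -> String {
    let padding = target.saturating_sub(display_width(cell));
    format!("{}{}", cell, " ".repeat(padding))
}

/// Width of a column is the widest cell it contains.
fn column_width(cells: &[&str]) -> usize {
    cells.iter().map(|c| display_width(c)).max().unwrap_or(0)
}
```

Measuring by display width rather than byte length or character count is what keeps CJK cells aligned with ASCII cells in a monospace code block.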
f883b23 to 37ffc63
chaodu-agent (Collaborator) left a comment
🔍 PR Review: #180 — Markdown Table Conversion Pipeline
✅ What's good
- ✅ Uses `pulldown-cmark` AST parser for table detection, no regex hacks
- ✅ Non-table content preserved verbatim via byte offsets, no accidental reformatting
- ✅ `unicode-width` for correct CJK/emoji column alignment
- ✅ Inline markup (bold/italic/link) properly stripped inside cells
- ✅ `#[serde(default)]` on config: fully backward compatible, existing configs just work
- ✅ Low integration footprint: only 6 lines changed in `discord.rs`
🟡 Should fix before merge
- Backtick in code mode: `Event::Code` preserves backticks, but the entire table is already inside a code block, so they render as literal characters. Strip them in code mode.
- Chunking order: `convert_tables()` runs before chunking. Code blocks make tables longer (fenced markers + padding). Verify downstream chunking still respects Discord's 2000-char limit.
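Both concerns can be made concrete with a short std-only sketch. The names `render_inline_code`, `wrap_in_fence`, and `fits_in_one_message` are hypothetical and not taken from the PR; only the two-mode distinction and the 2000-character figure come from the review.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum TableMode {
    Code,
    Bullets,
}

/// Renders the text of an inline-code span found inside a table cell.
/// In code mode the whole table sits inside a fenced block, so backticks
/// would show up as literal characters and are left out.
fn render_inline_code(text: &str, mode: TableMode) -> String {
    match mode {
        TableMode::Code => text.to_string(),
        TableMode::Bullets => format!("`{}`", text),
    }
}

/// Wraps an aligned table in a fenced block. The fence is built with
/// `repeat` only so this snippet contains no literal fence marker.
fn wrap_in_fence(table: &str) -> String {
    let fence = "`".repeat(3);
    format!("{}\n{}\n{}", fence, table, fence)
}

/// Limit quoted in the review; counted in characters here.
const DISCORD_LIMIT: usize = 2000;

fn fits_in_one_message(content: &str) -> bool {
    content.chars().count() <= DISCORD_LIMIT
}
```

In this sketch the fence adds eight characters (two markers plus two newlines), so a table that fits just under the limit before conversion may no longer fit afterwards. That is why the order of conversion and chunking matters.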
🟠 Suggested improvements (can follow up)
- Empty cells in bullets mode: `if cell.is_empty() { continue; }` causes inconsistent bullet counts across rows. Consider showing `• Header: —` instead.
- Per-channel config: `TableMode` is already passed as a parameter (architecture supports it), but config only has a global `[markdown]` section. No per-channel override yet.
- Test coverage: Current tests only do `contains()` spot checks. Add snapshot tests for full output verification, especially CJK alignment and multi-table documents.
- Tables inside code blocks: Add a test confirming tables already inside ``` fences are not double-processed.
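The empty-cell suggestion could look roughly like the following. This is a sketch with an assumed helper name (`row_to_bullets`) and an assumed `placeholder` parameter, not the PR's implementation.

```rust
/// Renders one table row as bullet lines, pairing each cell with its header.
/// Empty or missing cells are shown as `placeholder` instead of being skipped,
/// so every row produces the same number of bullets.
fn row_to_bullets(headers: &[&str], row: &[&str], placeholder: &str) -> Vec<String> {
    headers
        .iter()
        .enumerate()
        .map(|(i, header)| {
            let cell = row.get(i).map(|c| c.trim()).unwrap_or("");
            let value = if cell.is_empty() { placeholder } else { cell };
            format!("• {}: {}", header, value)
        })
        .collect()
}
```

Passing `"\u{2014}"` as the placeholder reproduces the dash shown in the suggestion above. Iterating over the headers rather than the cells also covers rows that are shorter than the header row.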
Verdict
🟢 Clean architecture, focused implementation, minimal integration surface. Fix the two 🟡 items and this is LGTM to merge.
added 2 commits
April 13, 2026 07:00
Bring in upstream changes (STT support, image attachments, error display, copilot variant) while preserving the markdown table conversion pipeline introduced in this branch. Conflicts resolved in Cargo.toml, config.rs, discord.rs, and main.rs by keeping both feature sets.
- parse_segments now takes a mode parameter: in Code mode, Event::Code cells omit the backtick wrapping since the table is already inside a fenced code block and backticks would render as literal characters. Bullets mode keeps backticks as they are valid inline markdown.
- split_message now tracks whether the cursor is inside a fenced code block (``` ... ```). When a chunk boundary falls mid-block, the current chunk is closed with ``` and the next chunk is reopened with ```, so each Discord message renders the code block correctly.
- Tests added for both fixes.
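The fence-aware chunking described in this commit message can be sketched as follows. This is a simplified, line-based version written for illustration. The signature and internals are assumptions rather than the PR's actual `split_message`, and it assumes no single line approaches the limit on its own.

```rust
/// Splits `text` into chunks of at most `limit` characters, breaking only at
/// line boundaries. If a break falls inside a fenced code block, the current
/// chunk is closed with a fence and the next chunk is reopened with one.
fn split_message(text: &str, limit: usize) -> Vec<String> {
    let fence = "`".repeat(3); // built with `repeat` to avoid a literal fence here
    let fence_cost = fence.chars().count() + 1; // fence plus newline
    let mut chunks = Vec::new();
    let mut current = String::new();
    let mut in_fence = false;

    for line in text.lines() {
        let is_fence_line = line.trim_start().starts_with(fence.as_str());
        let in_fence_after = in_fence != is_fence_line;
        // If the chunk will be inside a block after this line, keep room
        // for the closing fence that a later split would have to add.
        let reserve = if in_fence_after { fence_cost } else { 0 };
        let needed = line.chars().count() + 1 + reserve;

        if !current.is_empty() && current.chars().count() + needed > limit {
            if in_fence {
                current.push_str(&fence);
                current.push('\n');
            }
            chunks.push(std::mem::take(&mut current));
            if in_fence {
                current.push_str(&fence);
                current.push('\n');
            }
        }
        current.push_str(line);
        current.push('\n');
        in_fence = in_fence_after;
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

Reserving space for the closing fence before appending a line is what keeps a closed chunk within the limit. A reopened fence in this sketch drops any language tag the original fence carried.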
chaodu-agent (Collaborator) approved these changes on Apr 13, 2026 and left a comment
🟢 Both merge blockers from the previous review have been addressed:
- Backtick in code mode: `parse_segments` now takes `mode`; backticks stripped in Code mode. Tests added.
- Code-fence-aware chunking: `split_message` uses `chars().count()` (Unicode-safe), auto-closes/reopens fences across chunk boundaries. Streaming truncates instead of splitting mid-stream. Tests added.
LGTM ✅
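The difference between byte length and `chars().count()` mentioned in this review is easy to show. The helpers below are hypothetical and only illustrate the idea of measuring and truncating by characters. Whether Discord counts length in exactly the same way is not stated in the PR.

```rust
/// Length in Unicode scalar values rather than bytes.
fn char_len(s: &str) -> usize {
    s.chars().count()
}

/// Truncates `s` to at most `max_chars` characters. Slicing at a byte index
/// taken from `char_indices` never cuts a multi-byte character in half.
fn truncate_chars(s: &str, max_chars: usize) -> &str {
    match s.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => &s[..byte_idx],
        None => s,
    }
}
```

For a string such as `"表格"`, `len()` reports 6 bytes while `char_len` reports 2, so a limit enforced with `len()` would split CJK-heavy messages far earlier than necessary.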
thepagent approved these changes on Apr 13, 2026
thepagent added a commit that referenced this pull request on Apr 13, 2026
Restores release-pr.yml, tag-on-merge.yml, and ci.yml which were accidentally deleted by PR #180 during rebase.
thepagent added a commit that referenced this pull request on Apr 13, 2026
Revert "feat: Add markdown table conversion pipeline with pulldown-cmark (#180)"
Summary
Closes #178
Markdown tables in LLM responses render poorly on chat platforms like Discord. This PR introduces a proper markdown parsing pipeline to detect and convert tables before sending messages.
Changes
- `src/markdown.rs` (new): Core pipeline using `pulldown-cmark` to parse markdown and detect tables via AST tokens (not regex). Supports three rendering modes:
  - `code` (default): Wraps tables in fenced code blocks with aligned columns
  - `bullets`: Converts each row into bullet points (`• Header: Value`)
  - `off`: Pass-through, no conversion
- `src/config.rs`: Added `MarkdownConfig` with `tables: TableMode` field
- `src/discord.rs`: Calls `markdown::convert_tables()` on final content before chunking and sending
- `src/main.rs`: Registers `markdown` module and passes config to Discord handler
- `Cargo.toml`: Added `pulldown-cmark` 0.13 dependency
- `config.toml.example`: Added `[markdown]` config section
Config
Existing configs without a `[markdown]` section will default to `code` mode (no breaking change).
Design
The pipeline is channel-agnostic: `markdown::convert_tables(text, mode)` can be called from any future channel adapter with a different `TableMode` per channel.
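As a rough, std-only illustration of the mode dispatch described above, the sketch below renders a table that has already been parsed. The `Table` struct, `render_table`, `format_row`, and the `FromStr` parsing are all assumptions made for this example. The PR itself parses with `pulldown-cmark`, measures widths with `unicode-width`, and loads the mode through serde, none of which is reproduced here.

```rust
use std::str::FromStr;

/// The three modes named in the PR; variant names are assumed.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum TableMode {
    Code,
    Bullets,
    Off,
}

impl Default for TableMode {
    // A config without a `[markdown]` section falls back to code mode.
    fn default() -> Self {
        TableMode::Code
    }
}

impl FromStr for TableMode {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "code" => Ok(TableMode::Code),
            "bullets" => Ok(TableMode::Bullets),
            "off" => Ok(TableMode::Off),
            other => Err(format!("unknown table mode: {}", other)),
        }
    }
}

/// A table that has already been parsed; parsing is out of scope here.
struct Table {
    headers: Vec<String>,
    rows: Vec<Vec<String>>,
}

/// Left-aligns each cell to its column width and joins cells with " | ".
fn format_row(cells: &[String], widths: &[usize]) -> String {
    cells
        .iter()
        .zip(widths)
        .map(|(c, w)| format!("{:<width$}", c, width = *w))
        .collect::<Vec<_>>()
        .join(" | ")
}

/// Renders one table. `source` is the table's original markdown, returned
/// untouched in `Off` mode.
fn render_table(table: &Table, source: &str, mode: TableMode) -> String {
    match mode {
        TableMode::Off => source.to_string(),
        TableMode::Bullets => table
            .rows
            .iter()
            .map(|row| {
                table
                    .headers
                    .iter()
                    .zip(row)
                    .map(|(h, v)| format!("• {}: {}", h, v))
                    .collect::<Vec<_>>()
                    .join("\n")
            })
            .collect::<Vec<_>>()
            .join("\n\n"),
        TableMode::Code => {
            // Simplification: chars().count() stands in for display width.
            let mut widths: Vec<usize> =
                table.headers.iter().map(|h| h.chars().count()).collect();
            for row in &table.rows {
                for (w, cell) in widths.iter_mut().zip(row) {
                    *w = (*w).max(cell.chars().count());
                }
            }
            let fence = "`".repeat(3); // avoids a literal fence in this snippet
            let mut lines = vec![fence.clone(), format_row(&table.headers, &widths)];
            lines.extend(table.rows.iter().map(|r| format_row(r, &widths)));
            lines.push(fence);
            lines.join("\n")
        }
    }
}
```

Because the mode is an ordinary argument, a channel adapter could pass `TableMode::Bullets` for one platform and `TableMode::Code` for another without any change to the rendering code.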